Abstract - The problem of decentralized sequential detection 
with conditionally independent observations is studied. The 
sensors form a star topology with a central node called fusion 
center as the hub. The sensors make noisy observations of a 
parameter that changes from an initial state to a final state at 
a random time where the random change time has a geometric 
distribution. The sensors amplify and forward the observations 
over a wireless Gaussian multiple access channel and operate 
under either a power constraint or an energy constraint. The 
optimal transmission strategy at each stage is shown to be the 
one that maximizes a certain Ali-Silvey distance between the 
distributions for the hypotheses before and after the change. 
Simulations demonstrate that the proposed analog technique has 
lower detection delays when compared with existing schemes. 
Simulations further demonstrate that the energy-constrained 
formulation enables better use of the total available energy 
than the power-constrained formulation in the change detection 
problem. 

Index Terms - Ali-Silvey distance, change detection, correlation, 
Markov decision process, multiple access channel, sequential 
detection, sensor network 



I. Introduction 

Consider the use of a wireless sensor network for detection 
of a disruption or a change in environment. The change is 
required to be detected with minimum delay subject to a false 
alarm constraint. The standard medium access control and 
physical layer design for such a network (e.g., IEEE 802.15.4 
standard) is one where sensors quantize their observations and 
send them to a fusion center via random access over a wireless 
Gaussian multiple-access channel (GMAC). The transmitted 
data are typically quantized individual log-likelihood ratios 
(LLR) of the hypotheses representing the environment before 
and after the change. The fusion center collects each sensor's 
LLR and adds them to get a fused statistic, if observations 
at sensors are independent conditioned on the state of the 
environment; this would be the case when the observation 
noises are additive and independent from sensor to sensor.^1 
Such a design has a few drawbacks. 

1) It does not exploit the spatial correlation in observations 
across sensors. 

2) It does not exploit the superposition available on the 
GMAC. 
^1 As we will see later, conditional independence notwithstanding, sensor 
observations are correlated. 



3) It employs an ad hoc separation between quantization or 
compression on one hand, and transmission across the 
channel on the other; the latter requires adequate coding 
for noiseless reception and correct further processing at 
the fusion center. 

4) It requires sufficient time slots for sensors to resolve all 
channel contention.^2 

Our goal in this paper is to detect change in environment in a 
manner that addresses the aforementioned drawbacks. Specifi- 
cally, we consider a "star" topology of sensors. Sensors make 
an affine transformation of the observed data and transmit 
the output in an analog fashion over the GMAC. Given that 
observations at sensors at any instant are spatially correlated, 
only the sum of the LLRs is relevant to the decision maker, 
i.e., it is a sufficient statistic to decide on the change. By 
making the sensors simultaneously transmit an affine function 
of their LLRs in an analog fashion, and via distributed transmit 
beamforming, we exploit the spatial correlation in sensor data 
and the superposition available on the GMAC - the channel 
computes the required sum. Moreover, the analog data is in 
loose terms matched to the channel and does not require 
explicit channel coding. Finally, the sum is available at the 
fusion center in a single transmit duration unlike the situation 
in the random access case. 

The biggest challenge in our proposed technique is the prac- 
ticality of distributed transmit beamforming. The transmitters' 
clocks should be synchronized to some extent, so that carrier, 
phase, and symbol ticks align. A technique similar to the 
master-slave architecture proposed by Mudumbai, Barriac & 
Madhow [1] can be used to achieve this synchronization. The 
scheme exploits channel reciprocity in a time-division duplex 
(TDD) system. 

1) Organization and preview of main results: In Section II, 
we formulate and solve a change detection problem under 
a power-constrained setting.^3 We arrive at a Markov decision 
problem framework and show that parameters of the affine 
transformation should minimize the variance of the combined 
observation and GMAC noises, which turns out to be a nonconvex 
optimization problem. We then provide an explicit algorithm 
to compute the optimal control parameters. Section III 
considers an energy-constrained setting. Section IV compares 

^2 Alternatively, a time-division multiplexing protocol needs as many slots 
as there are sensors, and does not scale with the number of sensors. 

^3 Sensors are usually powered by batteries with a fixed energy. The 
power-constrained model arises when this energy is evenly split over the 
desired lifetime of the sensor (in samples). An energy-constrained model 
arises when there is flexibility in how this energy is expended from sample 
to sample (subject to, of course, constraints imposed by the power amplifier). 
the simulation performance of our scheme with a previously 
known scheme. It also compares the energy-constrained 
formulation of Section III with the power-constrained formulation 
of Section II. Appendix I contains a new characterization of 
optimal control: maximize a certain Ali-Silvey distance [2] 
between the distributions of the fusion center's observation 
before and after the change. This is used to arrive at the 
minimum variance criterion of Section II. 

2) Prior work: Change detection problems were solved in 
a centralized setting by Page [3], Lorden [4], and Shiryayev 
[5]. Shiryayev considered a Bayesian setting which is of 
relevance to our work. Veeravalli [6] solved the decentralized 
version of this problem with parallel error-free bit pipes of 
limited capacity from the sensors to the fusion center and 
identified the optimal stopping policy and quantizer structure. 
These results are analogous to those for hypothesis testing 
and sequential hypothesis testing (Tsitsiklis [7], Veeravalli et 
al. [8]). Prasanthi [9] considered access and decision delays 
in sequential detection over a random access channel, as it 
would be practically implemented using, for example, the 
IEEE 802.15.4 wireless personal area network standard. (See 
also [10]). Our work differs from those of Prasanthi and 
Veeravalli because we propose an analog transmission strategy. 

Analog transmissions are optimal for transmission of a 
single Gaussian source over a Gaussian channel (Berger [11, 
p. 100]) and a bivariate Gaussian source over a GMAC for a 
certain range of signal-to-noise ratios (SNR) (Lapidoth and 
Tinguely [12]), when a running estimate is required. Analog 
transmission via waveform design was considered by Mergen 
and Tong [13]. They used "type-based" multiple access to 
estimate a parameter over a GMAC. Their scheme, as does 
ours, exploits the superposition available in the GMAC. (See 
also [14], [15], [16], [17], [18], and [19] for analog 
transmission in other settings). Ertin and Potter [20] considered 
generalized cost functions that are mathematically analogous 
to our energy-constrained formulation. 

II. Physical Layer Fusion Framework 

A. Mathematical Formulation 

$X \sim \mathcal{N}(\theta, \sigma^2)$ indicates that $X$ is a Gaussian random 
variable with mean $\theta$ and variance $\sigma^2$. 

(1) The state of nature is described by $\{\theta_k : k \in \mathbb{Z}_+\}$, a 
two-state discrete-time Markov chain taking values in $\{m_0, m_1\}$, 
with transition probabilities as described in Fig. 1(a)-(b). The 
quantities $m_0$ and $m_1$ denote, for example, the mean level of 
the observations before and after the disruption. The initial 
distribution for this Markov chain is obtained from 
$\Pr\{\theta_0 = m_1\} = \nu$. The change time $\Gamma$ is $\mathbb{Z}_+$-valued, and given the 
event $\{\Gamma > 0\}$, $\Gamma$ has the geometric distribution. 

(2) The network has $L$ sensors. At time $k$, sensor $S_l$ makes 
an observation $X_{l,k} \sim \mathcal{N}(\theta_k, \sigma^2_{\mathrm{obs},l})$, i.e., $X_{l,k} = \theta_k + Z_{l,k}$, 
where $Z_{l,k} \sim \mathcal{N}(0, \sigma^2_{\mathrm{obs},l})$, $l = 1, \ldots, L$. 

(3) The observations at each sensor are independent, 
conditioned on $\theta_k$. Furthermore, the observations are independent 
from sensor to sensor, conditioned on $\theta_k$. Despite these 
conditional independence assumptions, we remark that $X_{l,k}$, 
$l = 1, \ldots, L$, are correlated. 
Fig. 1. Problem set-up. 

(4) Each sensor transmits $Y_{l,k} = \phi_{l,k}(X_{l,k})$. Since this is a 
function only of the observation at sensor $l$, our setting is a 
decentralized one. See Fig. 1(c). The function $\phi_{l,k}$ is affine: 

$$\phi_{l,k}(x) = \alpha_{l,k}(x - c_{l,k}). \tag{1}$$

Quantities $\alpha_k = (\alpha_{1,k}, \ldots, \alpha_{L,k})$ and $c_k = (c_{1,k}, \ldots, c_{L,k})$ 
are parameters for optimal control. Transmission is done by 
setting the amplitude of an underlying unit-energy waveform 
to $Y_{l,k}$. All sensors use the same underlying waveform. The 
motivations for the analog amplify-and-forward transmissions 
in (1) are those given in Section I, together with the conditional 
independence of the observations given the state and the Gaussian 
observation noise. If the latter does not hold, affine functions of LLRs 
instead of the direct observations could be sent ([21, Ch. 5]). 

(5) The GMAC output at the fusion center when projected 
onto the common waveform yields 

$$\tilde{Y}_k = \sum_{l=1}^{L} h_l Y_{l,k} + Z_{\mathrm{MAC},k},$$

where $Z_{\mathrm{MAC},k} \sim \mathcal{N}(0, \sigma^2_{\mathrm{MAC}})$ is independent and identically 
distributed (iid) across $k$, and is independent of all other 
quantities. The gain $h_l \in \mathbb{R}_+$ is the channel gain for the $l$th 
sensor and is deterministic. See Fig. 1(c). We assume perfect 
knowledge of the channel gains is available at the sensors and 
the fusion center. While this is not the case in practice, channel 
knowledge can be gleaned in time-division duplex (TDD) 
systems that possess channel reciprocity (IEEE 802.15.4). See 
Mudumbai, Barriac & Madhow [1] for a suggested master-slave 
architecture. In a subsequent section, we study the effect 
of imperfect knowledge of these gains. 

(6) At the fusion center, form $Y_k$ as follows: 

$$Y_k = \frac{\tilde{Y}_k + \sum_{l=1}^{L} h_l \alpha_{l,k} c_{l,k}}{\sum_{l=1}^{L} h_l \alpha_{l,k}} = \theta_k + Z_k, \tag{2}$$

where $Z_k \sim \mathcal{N}(0, \sigma_k^2)$ and 

$$\sigma_k^2 = \frac{\sum_{l=1}^{L} h_l^2 \alpha_{l,k}^2 \sigma^2_{\mathrm{obs},l} + \sigma^2_{\mathrm{MAC}}}{\left(\sum_{l=1}^{L} h_l \alpha_{l,k}\right)^2}. \tag{3}$$

The quantity $Y_k$ in (2) is obtained from $\tilde{Y}_k$ using a bijective 
mapping; so no information is lost. From (2) we also see 
that the distributed multi-sensor setting is equivalent to a 
centralized setting where the fusion center makes a direct 
(noisy) observation on $\theta_k$ with equivalent additive observation 
noise of variance as given in (3). This is enabled by the 
affine nature of $\phi_{l,k}$. The centralized problem with constant 
$\sigma_k^2$ was studied by Shiryayev [5] with the aim of characterizing 
the stopping rule. The new aspect here is the dependence of 
$\sigma_k^2$ on the control parameters. 

(7) The fusion center chooses an action $a_{k-1} \in A$ at time 
$k-1$ from the set $A$ of actions (controls) 

$$A = \{\text{stop}\} \cup \{(\text{continue}, \alpha, c) : \alpha \in \mathbb{R}_+^L, \; c \in \mathbb{R}^L\}.$$

If $a_{k-1} = \text{stop}$, the fusion center stops. If $a_{k-1} = 
(\text{continue}, \alpha_k, c_k)$, the fusion center takes another sample 
(the $k$th), and all sensors transmit $\phi_{l,k}(X_{l,k})$ with parameters 
$(\alpha_k, c_k)$. 

(8) As done by Veeravalli in [8], we assume a quasi-classical 
information structure, i.e., action $a_{k-1}$ depends on 

$$i_{k-1} = \{a_0, y_1, a_1, y_2, \ldots, a_{k-2}, y_{k-1}\}. \tag{4}$$

Even though the sensors may have local memory of past 
observations, our framework does not make use of this additional 
information.^4 The fusion center feeds back the action 
parameters $a_{k-1}$ to the sensors. (We use the following notation: the 
quantity $i_{k-1}$ in (4) is a realization of the random variable 
$I_{k-1}$ and takes values in the set $\mathcal{I}_{k-1}$. We set $I_0 = \emptyset$.) 
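(9) Average power constraint at sensor $l$ is 

$$\mathbb{E}\left[\phi_{l,k}(X_{l,k})^2 \mid I_{k-1}\right] \le P_l, \quad \text{i.e.,} \quad \alpha_{l,k}^2 \, \mathbb{E}\left[(X_{l,k} - c_{l,k})^2 \mid I_{k-1}\right] \le P_l, \quad l = 1, \ldots, L. \tag{5}$$

The set of feasible controls, given $I_{k-1} = i_{k-1}$, is denoted by 

$$A(i_{k-1}) = \{\text{stop}\} \cup \{(\text{continue}, \alpha, c) : (\alpha, c, i_{k-1}) \text{ satisfies (5)}\}. \tag{6}$$

In Section III we relax the constraint in (5) and impose an 
expected total energy constraint. 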

^4 Veeravalli [8, p. 434] discusses other information structures and why they 
may be difficult to analyze. 



(10) The fusion center policy $\pi$ is a sequence of proposed 
(deterministic) actions $\pi = (\pi_{k-1}, k \ge 1)$, where $\pi_{k-1}$ is 
a function $\pi_{k-1} : \mathcal{I}_{k-1} \to A$. In particular, $\pi_{k-1}(i_{k-1}) = 
a_{k-1} \in A(i_{k-1})$. Each policy $\pi$ induces a probability 
measure. All expectations are with respect to this measure. The 
dependence of the expectation operation on $\pi$ is understood 
and suppressed. 

(11) $\tau$ is the first instant when the fusion center decides to 
stop. 
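The reduction in items (5) and (6) to a single noisy observation of the state can be checked numerically. The following is a minimal sketch, assuming NumPy; the function names are illustrative and do not come from the paper. It evaluates the equivalent noise variance in (3) and simulates the normalization in (2) for a fixed state value.

```python
import numpy as np

def equivalent_noise_variance(alpha, h, sigma_obs2, sigma_mac2):
    """Variance in (3) of the fusion center's equivalent observation noise."""
    alpha, h, sigma_obs2 = map(np.asarray, (alpha, h, sigma_obs2))
    gain = np.sum(h * alpha)
    return (np.sum((h * alpha) ** 2 * sigma_obs2) + sigma_mac2) / gain ** 2

def fuse(y_raw, alpha, c, h):
    """Normalization in (2): map the raw GMAC output to theta_k plus noise."""
    alpha, c, h = map(np.asarray, (alpha, c, h))
    return (y_raw + np.sum(h * alpha * c)) / np.sum(h * alpha)

def simulate_fused(theta, alpha, c, h, sigma_obs2, sigma_mac2, n, rng):
    """Draw n independent fused samples for a fixed state value theta."""
    alpha, c, h, sigma_obs2 = map(np.asarray, (alpha, c, h, sigma_obs2))
    # Sensor observations X_l = theta + Z_l, one row per time sample.
    x = theta + rng.normal(size=(n, len(h))) * np.sqrt(sigma_obs2)
    # Superposition on the GMAC: sum of h_l * alpha_l * (X_l - c_l) plus noise.
    y_raw = (h * alpha * (x - c)).sum(axis=1)
    y_raw = y_raw + rng.normal(size=n) * np.sqrt(sigma_mac2)
    return fuse(y_raw, alpha, c, h)
```

The sample mean of the fused output should be close to the state value and its sample variance close to the value given by (3), whatever the offsets c are.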

The problem we wish to solve is the following: 

Problem 1: (Change detection with delay penalty) Minimize 
over all admissible policies the expected detection delay, 
$E_{\mathrm{DD}} = \mathbb{E}[(\tau - \Gamma)^+]$, subject to an upper bound on the 
probability of false alarm $P_{\mathrm{FA}} \le \delta$, where $x^+ = \max(0, x)$, 
and $P_{\mathrm{FA}} = \Pr\{\tau < \Gamma\}$. 

The solution to Problem 1 is obtained via a solution to 
Problem 2 (below) for a particular $\lambda > 0$ (Shiryayev [5]). The 
quantity $\lambda$ may be interpreted as the cost of unit delay. 

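Problem 2: (Change detection with a Bayes cost) Minimize 
over all admissible policies 

$$R(\lambda) = P_{\mathrm{FA}} + \lambda E_{\mathrm{DD}} = \Pr\{\Gamma > \tau\} + \lambda \, \mathbb{E}\left[(\tau - \Gamma)^+\right] = \mathbb{E}\left[\mathbf{1}\{\theta_\tau = m_0\} + \sum_{k=0}^{\tau-1} \lambda \mathbf{1}\{\theta_k = m_1\}\right], \tag{7}$$

where $\lambda > 0$ and $\mathbb{E}$ is under the probability measure induced 
by the chosen policy. 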

The cost function is additive over time. The first term within 
the expectation in (7) is the terminal cost; the terms in the 
summation constitute a running cost. At each stage the state $\theta_k$ evolves 
in a Markov fashion. The controller sees only a noisy version 
$Y_k$ of the state, but can control the observation noise variance 
$\sigma_k^2$ via $\alpha$ and $c$. It can also stop at any stage and pay a terminal 
cost. Any decision affects the future evolution of the cost 
process. Such problems are Markov decision problems (MDP) 
with partial observations. They can be analyzed by studying 
an equivalent complete observation MDP^5 with a reduced 
(posterior) state $\mu_k = \mathbb{E}[\mathbf{1}\{\theta_k = m_1\} \mid I_k] = \Pr\{\Gamma \le k \mid I_k\}$. 
The probability law for $\{\mu_k : k \ge 0\}$ is given as follows: 
$\mu_0 = \Pr\{\Gamma \le 0 \mid I_0\} = \nu$, and the law for $\mu_{k+1}$, under 
$a_k = (\text{continue}, \alpha_{k+1}, c_{k+1})$, is (see Veeravalli [6, eqn. (9)]) 



$$\mu_{k+1} = \frac{\beta_k f_{m_1,\alpha_{k+1}}(Y_{k+1})}{\beta_k f_{m_1,\alpha_{k+1}}(Y_{k+1}) + (1 - \beta_k) f_{m_0,\alpha_{k+1}}(Y_{k+1})} \triangleq \frac{g(Y_{k+1}, \alpha_{k+1}, \mu_k)}{h(Y_{k+1}, \alpha_{k+1}, \mu_k)} \triangleq \psi(Y_{k+1}, \mu_k, \alpha_{k+1}), \tag{8}$$

where $\beta_k = \Pr\{\Gamma \le k+1 \mid I_k\} = \mu_k + (1 - \mu_k)p$, and 
$f_{m_i,\alpha_{k+1}}$ is the density of an $\mathcal{N}(m_i, \sigma_{k+1}^2)$ random variable. 
The quantities $h$ and $g$ are as in (8); $h$ is the density of $Y_{k+1}$ 
given $(I_k, a_k)$, and $g$ is a scaled density. The power constraint 
(5) when written for time $k+1$ simplifies to 

$$\alpha_{l,k+1}^2 \left[\sigma^2_{\mathrm{obs},l} + (m_0 - c_{l,k+1})^2 (1 - \beta_k) + (m_1 - c_{l,k+1})^2 \beta_k\right] \le P_l. \tag{9}$$

The set of feasible controls in (6) depends on $i_k$ only through 
$\mu_k$ and can be simplified to 

$$A(\mu) = \{\text{stop}\} \cup \{(\text{continue}, \alpha, c) : (\alpha, c, \mu) \text{ satisfies (9)}\},$$

where $A(\cdot)$ is re-used to denote the set of feasible controls 
for the equivalent complete observation MDP. Let $A'(\mu) = 
\{(\alpha, c) : (\text{continue}, \alpha, c) \in A(\mu)\}$ denote the set of control 
parameters when the action is to continue. Now consider 
the objective function. Taking conditional expectations with 
respect to the information process (see Shiryayev [5, pp. 195-196]), 
(7) reduces to 

$$R(\lambda) = \mathbb{E}\left[(1 - \mu_\tau) + \sum_{k=0}^{\tau-1} \lambda \mu_k\right]. \tag{10}$$

^5 See Shiryayev [5], Veeravalli [6] for results with stopping, Bertsekas & 
Shreve [22, Ch. 10] for discounted costs, and Bertsekas [23, Ch. V]. 



Minimization of (10) is done via dynamic programming. Some 
additional remarks are in order. 

Remarks: 1. The variance $\sigma_{k+1}^2$ depends on $\alpha_{k+1}$ as shown 
in (3), and hence the dependence on $\alpha_{k+1}$ in (8). $\mu_{k+1}$ depends 
on $c_{k+1}$ only through $\alpha_{k+1}$ because of the processing done in 
(2). 

2. If the running cost is $\lambda$ instead of $\lambda \mathbf{1}\{\theta_k = m_1\}$ in (7), 
every sample costs $\lambda$ units, not just those beyond the change 
point that contribute to the delay. This is a minor variation to 
Problem 2 and has a similar solution. 

3. Another variation is sequential hypothesis testing: set 
the transition probability $p = 0$, enhance the action stop to 
$(\text{stop}, d)$, where $d$ is the decision (either $m_0$ or $m_1$), and set 
the terminal cost to $\mathbf{1}\{d \neq \theta\}$. The running cost is a constant 
$\lambda$ for every sample. 
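The posterior recursion in (8) is a one-line Bayes update once the equivalent noise variance is known. The sketch below is illustrative only: the function names are assumptions, and the variance is passed in directly rather than computed from the control parameters via (3).

```python
import math

def gaussian_pdf(y, mean, var):
    """Density of a N(mean, var) random variable evaluated at y."""
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def posterior_update(mu, y, sigma2, m0, m1, p):
    """One step of (8): posterior probability of change after observing y.

    mu is the current posterior, sigma2 the equivalent noise variance of the
    next fused observation, and p the geometric parameter of the change time.
    Extreme values of y can underflow both densities; this sketch does not
    guard against that case.
    """
    beta = mu + (1.0 - mu) * p                           # Pr{change by time k+1 | I_k}
    g = beta * gaussian_pdf(y, m1, sigma2)               # scaled density (numerator)
    h = g + (1.0 - beta) * gaussian_pdf(y, m0, sigma2)   # density of Y_{k+1}
    return g / h
```

When the observation falls midway between $m_0$ and $m_1$ the two densities coincide, so the update returns $\beta_k$ itself; larger observations (for $m_1 > m_0$) push the posterior above $\beta_k$.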

B. Optimal Policy 

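As is usual with such problems, we first restrict the stopping 
time $\tau$ to a finite horizon $T$. Using Bertsekas's result [23, Ch. 1, 
Prop. 3.1], the cost-to-go function recursions are written as 

$$J_T^T(\mu_T) = 1 - \mu_T,$$

$$J_k^T(\mu_k) = \min\left\{1 - \mu_k, \; \lambda\mu_k + A_k^T(\mu_k)\right\}, \quad 0 \le k < T,$$

$$A_k^T(\mu) = \min_{(\alpha,c) \in A'(\mu)} \mathbb{E}\left[J_{k+1}^T\big(\psi(Y, \mu, \alpha)\big)\right] = \min_{(\alpha,c) \in A'(\mu)} \int J_{k+1}^T\left(\frac{g(y,\alpha,\mu)}{h(y,\alpha,\mu)}\right) h(y,\alpha,\mu)\, dy.$$

To solve Problem 2, let $T \to \infty$. From results in [8] and [6], 
the limit in (11) below exists, does not depend on $k$ (i.e., the 
policy is stationary), and defines the infinite horizon cost-to-go 
function: 

$$J(\mu) = \lim_{T \to \infty} J_k^T(\mu) = \min\left\{1 - \mu, \; \lambda\mu + A_J(\mu)\right\}, \tag{11}$$

where 

$$A_J(\mu) = \min_{(\alpha,c) \in A'(\mu)} \mathbb{E}\left[J\big(\psi(Y, \mu, \alpha)\big)\right]. \tag{12}$$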

The following lemma enables a characterization of the optimal 
stopping policy. 

Lemma 1: The functions $J_k^T(\mu)$ and $A_k^T(\mu)$ are non-negative 
and concave functions of $\mu$, for $\mu \in [0,1]$. Moreover, 
$A_k^T(1) = J_k^T(1) = 0$. Similarly, the functions $J(\mu)$ and $A_J(\mu)$ 
are non-negative and concave functions of $\mu$, for $\mu \in [0,1]$, 
and $A_J(1) = J(1) = 0$. 

The proof is the same as that in Bertsekas [23, p. 268] for 
sequential hypothesis testing. The concavity of $A_J(\mu)$ and (11) 
imply the following theorem (Shiryayev [5], Veeravalli [6]). 

Theorem 2: An optimal fusion center policy has stopping 
time $\tau$ given by $\tau = \inf\{k : \mu_k \ge \mu^*\}$, where $\mu^*$ is the unique 
solution to $\lambda\mu + A_J(\mu) = 1 - \mu$. 

To summarize, the optimal detection strategy at time $k$ is 
as follows. Convert the received signal into the posterior 
probability of change $\mu_k$ using (2) and (8). If $\mu_k$ exceeds 
a threshold, declare that a change has occurred. Otherwise, 
make the sensors transmit another sample using parameters 
$\alpha$, $c$ chosen optimally as described in the next subsection. 
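The threshold rule of Theorem 2 can be illustrated with a short simulation of one sample path. This is only a sketch under simplifying assumptions: the equivalent noise variance `sigma2` is held fixed (in the paper it depends on the controls through (3)), the threshold `mu_star` is supplied by the caller rather than computed from (11), and all names are hypothetical.

```python
import numpy as np

def run_detector(mu_star, sigma2, m0, m1, p, nu, rng, max_steps=10_000):
    """Simulate one path; returns (stopping time, change time or None).

    The change time is None if the state has not yet changed when the
    detector stops, i.e., the stop is a false alarm.
    """
    changed = rng.random() < nu              # theta_0 = m1 with probability nu
    change_time = 0 if changed else None
    mu = nu                                  # posterior mu_0
    for k in range(max_steps):
        if mu >= mu_star:                    # stop at the first threshold crossing
            return k, change_time
        if not changed and rng.random() < p: # Markov transition to time k+1
            changed, change_time = True, k + 1
        theta = m1 if changed else m0
        y = theta + np.sqrt(sigma2) * rng.normal()   # fused observation as in (2)
        beta = mu + (1.0 - mu) * p
        r = np.exp((m1 - m0) * (y - 0.5 * (m0 + m1)) / sigma2)  # likelihood ratio
        mu = beta * r / (beta * r + 1.0 - beta)      # posterior update (8)
    return max_steps, change_time
```

Since the false alarm probability equals the expected value of $1 - \mu_\tau$, stopping only when $\mu_k \ge \mu^*$ keeps it at or below $1 - \mu^*$, which a Monte Carlo run over many paths should reflect.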

C. Parameters for Optimal Control 

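We begin this section with an algorithm that calculates the 
optimal $\alpha$. 

Algorithm 1: Let 

$$\sigma^2_{\mathrm{obs},1} h_1 \alpha_{\max,1} \le \cdots \le \sigma^2_{\mathrm{obs},L} h_L \alpha_{\max,L},$$

where the quantity 

$$\alpha_{\max,l} = \left[\frac{P_l}{\sigma^2_{\mathrm{obs},l} + (m_1 - m_0)^2 \beta (1 - \beta)}\right]^{1/2}$$

with $\beta = \mu + (1 - \mu)p$. 

• Step 1: Find the unique $k \in \{1, \ldots, L-1\}$ that satisfies 

$$\sigma^2_{\mathrm{obs},k} h_k \alpha_{\max,k} < \frac{\sum_{l=1}^{k} (\sigma_{\mathrm{obs},l} h_l \alpha_{\max,l})^2 + \sigma^2_{\mathrm{MAC}}}{\sum_{l=1}^{k} h_l \alpha_{\max,l}} \le \sigma^2_{\mathrm{obs},k+1} h_{k+1} \alpha_{\max,k+1} \tag{13}$$

if it exists. Otherwise, set $k = L$. 

• Step 2: Set the optimal $\alpha$ as follows. 

$$a^* = \sum_{l=1}^{k} h_l \alpha_{\max,l} + \left(\sum_{l=k+1}^{L} \sigma^{-2}_{\mathrm{obs},l}\right) \frac{\sum_{l=1}^{k} (\sigma_{\mathrm{obs},l} h_l \alpha_{\max,l})^2 + \sigma^2_{\mathrm{MAC}}}{\sum_{l=1}^{k} h_l \alpha_{\max,l}}, \tag{14}$$

$$\alpha_m = \begin{cases} \alpha_{\max,m}, & 1 \le m \le k, \\[4pt] \dfrac{1}{\sigma^2_{\mathrm{obs},m} h_m} \cdot \dfrac{a^* - \sum_{l=1}^{k} h_l \alpha_{\max,l}}{\sum_{l=k+1}^{L} \sigma^{-2}_{\mathrm{obs},l}}, & k < m \le L. \end{cases} \tag{15}$$

The optimal choice sets amplitudes of the $k$ sensors with the 
$k$ least scaled observation noise variance ($\sigma^2_{\mathrm{obs},l} h_l \alpha_{\max,l}$) to 
$\alpha_{\max,l}$. The remaining sensors' amplitudes are appropriately 
chosen smaller values. Intuitively, sensors $l = k+1, \ldots, L$ 
have so good a channel that scaling by $\alpha_{\max,l}$ for these sensors 
will amplify the observation noise leading to a larger overall 
noise variance. Note that when all channel gains, observation 
variances, and power constraints are equal, $\alpha_l = \alpha_{\max}$ for all 
sensors. This special case was earlier proved in [24]. 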
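The two steps of Algorithm 1 can be sketched in a few lines. The code below is an illustrative reading of the algorithm, assuming NumPy and strictly positive channel gains; the function and argument names are not from the paper. It sorts the sensors as the algorithm requires, searches for $k$ as in (13), and applies (15) after substituting (14), which reduces the unsaturated amplitudes to the middle term of (13) divided by $\sigma^2_{\mathrm{obs},m} h_m$.

```python
import numpy as np

def optimal_amplitudes(h, sigma_obs2, P, sigma_mac2, m0, m1, mu, p):
    """Sketch of Algorithm 1; returns amplitudes in the caller's sensor order."""
    h, v, P = (np.asarray(x, dtype=float) for x in (h, sigma_obs2, P))
    beta = mu + (1.0 - mu) * p
    a_max = np.sqrt(P / (v + (m1 - m0) ** 2 * beta * (1.0 - beta)))
    order = np.argsort(v * h * a_max)          # ascending scaled noise variance
    hs, vs, ams = h[order], v[order], a_max[order]
    L = len(hs)
    s = hs * ams                               # h_l * alpha_max_l
    # kappa[j-1] is the middle term of (13) evaluated for k = j.
    kappa = (np.cumsum(vs * s ** 2) + sigma_mac2) / np.cumsum(s)
    k = L
    for j in range(1, L):                      # Step 1: search k in {1, ..., L-1}
        if vs[j - 1] * s[j - 1] < kappa[j - 1] <= vs[j] * s[j]:
            k = j
            break
    alpha_sorted = ams.copy()                  # Step 2: first k sensors at maximum
    alpha_sorted[k:] = kappa[k - 1] / (vs[k:] * hs[k:])   # (15) with (14) substituted
    alpha = np.empty(L)
    alpha[order] = alpha_sorted                # undo the sort
    return alpha
```

As a small example, with two unit-variance sensors of gains 10 and 1, unit power limits, unit GMAC noise variance, and $m_0 = m_1$ (so that both maximum amplitudes equal 1), the weaker sensor transmits at its maximum while the stronger one backs off to 0.2.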

Theorem 3: The choice of $c_l = m_1\beta + m_0(1 - \beta)$, $l = 
1, \ldots, L$, and $\alpha$ according to Algorithm 1 constitute the 
optimal controls that minimize (12). 

Proof: Step 1: We prove that the optimal control 
minimizes the variance (3). Consider $\alpha$ and $\alpha'$ with resulting 
variances $\sigma^2 < \sigma'^2$. From the second equality in (2) we have 

$$Y(\alpha) = \theta + \sigma Z, \tag{16}$$

$$Y(\alpha') = \theta + \sigma' Z' = \theta + \sigma Z_1 + (\sigma'^2 - \sigma^2)^{1/2} Z_2, \tag{17}$$

where $Z$, $Z_1$, $Z_2$ are iid $\mathcal{N}(0, 1)$ with 
$\sigma' Z' = \sigma Z_1 + (\sigma'^2 - \sigma^2)^{1/2} Z_2$. The 
time index $k$ is understood. 

From (16) and (17), $Y(\alpha')$ is a stochastically degraded 
version of $Y(\alpha)$ and is equivalent to an additional random 
processing on $Y(\alpha)$. Theorem 5 in Appendix I shows that 

$$1 - \mathbb{E}_h\left[J\left(\frac{g(Y(\alpha), \alpha, \mu)}{h(Y(\alpha), \alpha, \mu)}\right)\right]$$

is an Ali-Silvey distance between two probability measures. In 
$\mathbb{E}_h$ the dependence of $h$ on $\alpha$ is understood and suppressed. 
Ali-Silvey distances have a well-known monotonicity 
property: data processing, whether deterministic or random, cannot 
increase the dissimilarity measure between two distributions 
([2], [25]). This property implies that 

$$\mathbb{E}_h\left[J\left(\frac{g(Y(\alpha), \alpha, \mu)}{h(Y(\alpha), \alpha, \mu)}\right)\right] \le \mathbb{E}_h\left[J\left(\frac{g(Y(\alpha'), \alpha', \mu)}{h(Y(\alpha'), \alpha', \mu)}\right)\right].$$

It follows that minimization of the variance in (3) is the 
criterion for getting the optimal $\alpha$. 

Step 2: We now identify the optimal $c$. The minimization 
mentioned in the previous step should be done subject to the 
power constraint given in (5), which can be rewritten as 

$$\alpha_{l,k}^2 \le \frac{P_l}{\sigma^2_{\mathrm{obs},l} + \mathbb{E}\left[(\theta_k - c_{l,k})^2 \mid I_{k-1}\right]}. \tag{18}$$

The constraint set is enlarged if the upper bound in (18) is 
higher. We should therefore choose the $c_{l,k}$ that minimizes 
$\mathbb{E}[(\theta_k - c_{l,k})^2 \mid I_{k-1}]$, i.e., $c_{l,k}$ is the minimum mean squared 
error (MMSE) estimate of $\theta_k$ given $I_{k-1}$. Clearly this is given 
by $c_{l,k} = \mathbb{E}[\theta_k \mid I_{k-1}] = m_1\beta_{k-1} + m_0(1 - \beta_{k-1})$, and is 
independent of $l$. Moreover, 

$$\mathbb{E}\left[(\theta_k - c_{l,k})^2 \mid I_{k-1}\right] = \mathrm{Var}\{\theta_k \mid I_{k-1}\} = (m_1 - m_0)^2 \beta_{k-1}(1 - \beta_{k-1}),$$

and (18) can be written as $\alpha_{l,k} \le \alpha_{\max,l,k}$, where 

$$\alpha_{\max,l,k} = \left[\frac{P_l}{\sigma^2_{\mathrm{obs},l} + (m_1 - m_0)^2 \beta_{k-1}(1 - \beta_{k-1})}\right]^{1/2}.$$

Fig. 2. Performance curves: 1) Clipped transmission via a sigmoidal function 
2) Affine transformation 3) Centralized, where all sensor data is available 
without noise at the fusion center. 



Step 3: Ignoring the time index $k$, the optimization problem 
to obtain the best $\alpha$ is: 

Problem 3: Minimize 

$$\sigma^2 = \frac{\sum_{l=1}^{L} h_l^2 \alpha_l^2 \sigma^2_{\mathrm{obs},l} + \sigma^2_{\mathrm{MAC}}}{\left(\sum_{l=1}^{L} h_l \alpha_l\right)^2},$$

where $\alpha_l \in [0, \alpha_{\max,l}]$ for $l = 1, \ldots, L$. 

This is not a convex optimization problem. However, we 
can split it into two simpler convex optimization problems to 
get an explicit solution to Problem 3. 

Lemma 4: Algorithm 1 solves Problem 3. 

See Appendix II for a proof. This concludes the proof of 
Theorem 3. ■ 

Under the restriction of affine controls, Theorem 3 describes 
the optimal choice. However, affine controls are not optimal 
in general. This is demonstrated in Fig. 2, where a piece-wise 
linear sigmoidal control outperforms the optimal affine control 
(see [21, Sec. 2.7]). It would be interesting to see if there are 
ranges of $\sigma^2_{\mathrm{obs},l}$ and $\sigma^2_{\mathrm{MAC}}$ where the affine control is indeed 
optimal. We do not pursue this question in this work. 

We now make some remarks on the complexity of overall 
detection. Theorem 3 says that the parameters for optimal 
control are obtained via a finite step procedure. Indeed, Algorithm 
1 gives the output in time linear in the number of sensors, and 
is therefore easy to execute. The threshold calculation for a 
fixed set of parameters is a one-time calculation and is obtained 
via the so-called value iteration procedure, which yields an 
approximation. We now explore further simplifications with 
reduced feedback information. 

D. A Simpler Suboptimal Policy 

Let us now restrict the controls to be of the following form: 
the decision to stop or continue, say $b_k$, depends on $I_k$, but 
the parameters of the affine transformation at time $k+1$ can 
only depend on $I_0$ and $b_k \in \{\text{stop}, \text{continue}\}$. $I_0$ denotes the 
prior information before any observations are made and $b_k$ is 
the decision of the fusion center at $k$. Note that this reduces 
the amount of feedback to simply the binary random variable 
$b_k$. 

The structure of the controls is similar to that of the optimal 
policy of the previous section, but with 

$$\beta_k = \Pr\{\Gamma \le k+1 \mid I_0\} = 1 - (1 - \nu)(1 - p)^{k+1},$$

so that $(\alpha, c)$ depends on only $I_0$ and not on $I_k$. The stopping 
policy is chosen as in Theorem 2. As we see in simulation 
results presented in Section IV, the performance of this 
algorithm is close to optimal for the chosen parameters, yet 
requires feedback of only one bit at each stage. 
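Because the open-loop $\beta_k$ above depends only on the prior, the whole schedule of control parameters can be computed before any observation is made. A minimal sketch, with an illustrative function name, follows; it also makes it easy to check that the closed form agrees with the recursion $\beta_k = \mu_k + (1 - \mu_k)p$ when no observations are used.

```python
def open_loop_beta(k, nu, p):
    """Pr{change time <= k+1} under the geometric prior, using I_0 only."""
    return 1.0 - (1.0 - nu) * (1.0 - p) ** (k + 1)

# Example schedule for the first five transmissions (nu = 0, p = 0.05).
schedule = [open_loop_beta(k, 0.0, 0.05) for k in range(5)]
```

Each entry of `schedule` can then be substituted for $\beta$ in the expressions for $c_l$ and $\alpha_{\max,l}$.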

III. Energy-Constrained Formulation 
The energy-constrained problem is stated as follows. 

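Problem 4: Minimize the expected detection delay, $E_{\mathrm{DD}}$, 
subject to an upper bound on the probability of false alarm, 
$P_{\mathrm{FA}} \le \delta$, and an upper bound on the expected energy spent, 

$$\mathbb{E}\left[\sum_{k=1}^{\tau} \alpha_{l,k}^2 (X_{l,k} - c_{l,k})^2\right] \le E_l, \quad l = 1, 2, \ldots, L. \tag{19}$$

Let $\lambda = (\lambda_1, \ldots, \lambda_L, \lambda_{L+1})$. As before, to solve Problem 
4 we set up the Bayes cost $R(\lambda)$ and minimize it over all 
admissible choices of stopping policy and the parameters $\alpha_{l,k}$ 
and $c_{l,k}$ of the affine transformation $\phi_{l,k}$. The Bayes cost can 
be written as 

$$R(\lambda) = \mathbb{E}\left[(1 - \mu_\tau) + \lambda_{L+1} \sum_{k=0}^{\tau-1} \mu_k + \sum_{k=1}^{\tau} \sum_{l=1}^{L} \lambda_l \alpha_{l,k}^2 \, \mathbb{E}\left[(X_{l,k} - c_{l,k})^2 \mid I_{k-1}\right]\right].$$

A result analogous to Theorem 2 in Section II-B holds, and 
the optimal control at time $k+1$, given $I_k$, is such that $c_{l,k+1}$ 
is independent of $l$, the sensor index. More precisely, 

$$c_{l,k+1} = m_1\beta_k + m_0(1 - \beta_k), \quad l = 1, \ldots, L,$$

$$\alpha_{k+1} = \arg\min_{\alpha} \left[\sum_{l=1}^{L} \lambda_l \alpha_l^2 \left(\sigma^2_{\mathrm{obs},l} + (m_1 - m_0)^2 \beta_k(1 - \beta_k)\right) + \int J\left(\frac{g(y, \alpha, \mu_k)}{h(y, \alpha, \mu_k)}\right) h(y, \alpha, \mu_k)\, dy\right],$$

where $J(\mu) = \min\{1 - \mu, \; \lambda_{L+1}\mu + A_J(\mu)\}$ is the infinite 
horizon cost-to-go function with 

$$A_J(\mu) = \min_{\alpha} \left[\sum_{l=1}^{L} \lambda_l \alpha_l^2 \left(\sigma^2_{\mathrm{obs},l} + (m_1 - m_0)^2 \beta(1 - \beta)\right) + \int J\left(\frac{g(y, \alpha, \mu)}{h(y, \alpha, \mu)}\right) h(y, \alpha, \mu)\, dy\right].$$

A minimizing control $\alpha$ does exist, as is shown in [21, Sec. 
3.1]. 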



Fig. 3. Comparison of our algorithms with Veeravalli's scheme. The 
"centralized" performance curve is for the case when all sensor data is 
available without noise at the fusion center. 



IV. Comparisons and Practical Considerations 

A. Benefits from Exploiting Sensor Correlation 

Veeravalli [6] addresses the structure of the optimal $D_l$-level 
quantizer at sensor $S_l$, $l = 1, 2, \ldots, L$. His model is applicable 
to a system that allows $\log_2 D_l$ bits to be sent error-free from 
sensor $S_l$ to the fusion center. For simplicity let $D_l = D$, $l = 
1, 2, \ldots, L$. In order to show the benefit of exploiting 
correlation of observations when transmitting across the GMAC, 
we do the following. The quantized bits from the sensors in 
Veeravalli's scheme are transmitted using an optimal scheme 
designed for independent data streams over a coherent GMAC. 
If all sensors operate at the same transmission power, the SNR 
required to support such a transmission on the GMAC satisfies 
the sum rate constraint $L \log_2 D \le (1/2) \log_2 (1 + L \cdot \mathrm{SNR})$, 
and thus 

$$\mathrm{SNR} \ge \frac{D^{2L} - 1}{L}. \tag{20}$$
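The bound in (20) follows by exponentiating the sum rate constraint: $D^{2L} \le 1 + L \cdot \mathrm{SNR}$. A short illustrative helper (the name is an assumption) makes the growth with $L$ easy to tabulate.

```python
def min_snr_for_quantized_scheme(L, D):
    """Smallest SNR with L*log2(D) <= 0.5*log2(1 + L*SNR), as in (20)."""
    return (D ** (2 * L) - 1) / L

# Required SNR for one-bit quantizers (D = 2) and 1 to 4 sensors.
required = [min_snr_for_quantized_scheme(L, 2) for L in range(1, 5)]
```

For one-bit quantizers the requirement is 7.5 with two sensors and 21 with three, which illustrates the exponential growth in SNR with the number of sensors.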



For the simulations, we assume two sensors ($L = 2$) with 
equal gains, i.e., $h_l = 1$ for $l = 1, 2$. We also assume one-bit 
quantizers ($D = 2$). From (20) we get $\mathrm{SNR} \ge 7.5$. Algorithms 
operate at SNR = 7.5 with $P_l = 7.5$ for $l = 1, 2$ and $\sigma^2_{\mathrm{MAC}} = 
1$. We now summarize the other simulation assumptions which 
will be used unless stated otherwise. 



Simulation Setup 1: Consider $L = 2$ sensors with $\mathcal{N}(0, 1)$ 
and $\mathcal{N}(0.75, 1)$ observations before and after the change, 
respectively. The geometric parameter $p = 0.05$ and the initial 
probability of change $\nu = 0$. We obtain $J(\mu)$ via value 
iteration procedure until the difference between successive 
iterates falls below 0.0001 with 1000 points on the $\mu$ axis. 
All simulations assume $P_l = P$ and $\sigma^2_{\mathrm{obs},l} = 1$ for $l = 1, 2$. 
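The value iteration mentioned in the setup can be sketched as follows. This is an illustrative implementation, not the authors' code: it assumes NumPy, takes the equivalent noise variance as a caller-supplied function `sigma2_of_mu` (a hypothetical name standing in for the variance produced by the optimal controls at each posterior), evaluates the expectation in (12) by Gauss-Hermite quadrature, and interpolates $J$ linearly between grid points.

```python
import numpy as np

def value_iteration(sigma2_of_mu, lam, p, m0, m1,
                    n_grid=1000, tol=1e-4, n_quad=32, max_iter=10_000):
    """Approximate J(mu) = min{1 - mu, lam*mu + A_J(mu)} on a uniform grid."""
    mu = np.linspace(0.0, 1.0, n_grid)
    beta = mu + (1.0 - mu) * p
    sigma2 = np.array([sigma2_of_mu(m) for m in mu])
    z, w = np.polynomial.hermite_e.hermegauss(n_quad)
    w = w / np.sqrt(2.0 * np.pi)               # weights for a standard normal
    sd = np.sqrt(sigma2)[:, None]
    b = beta[:, None]
    J = 1.0 - mu                               # start from the stopping cost
    for _ in range(max_iter):
        cont = np.zeros(n_grid)
        # h is a mixture: N(m1, sigma2) w.p. beta and N(m0, sigma2) w.p. 1-beta.
        for mean, prob in ((m1, beta), (m0, 1.0 - beta)):
            y = mean + sd * z[None, :]         # quadrature nodes for this component
            r = np.exp((m1 - m0) * (y - 0.5 * (m0 + m1)) / sigma2[:, None])
            psi = b * r / (b * r + 1.0 - b)    # next posterior, as in (8)
            J_next = np.interp(psi.ravel(), mu, J).reshape(psi.shape)
            cont += prob * (J_next * w).sum(axis=1)
        J_new = np.minimum(1.0 - mu, lam * mu + cont)
        if np.max(np.abs(J_new - J)) < tol:    # stopping rule from the setup
            return mu, J_new
        J = J_new
    return mu, J
```

The threshold $\mu^*$ of Theorem 2 can then be read off as the smallest grid point at which the returned $J$ equals $1 - \mu$.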

Fig. 3 shows that both our algorithms give lesser delays than 
Veeravalli's algorithm that is naively overlaid on the GMAC. 
Furthermore, the suboptimal policy of Section II-D degrades 
from that in Section II-B only for low false alarm probabilities. 

In Veeravalli's algorithm, $D - 1$ thresholds ($\in \mathbb{R}^{D-1}$) and a 
decision to stop or continue are fed back to each sensor. Our 
scheme requires feedback of $\alpha_l \in \mathbb{R}_+$, $c_l \in \mathbb{R}$, and the binary 
decision. Even simpler is the strategy in Section II-D: only a 
binary decision is fed back. 

Fig. 4. Performance curves for channel SNR = ∞, 3, 0, −3 dB. 

Fig. 5. Performance curves for observation SNR = −1, −2.5, −4 dB. 

The network delay is independent of the number of sensors 
in both our algorithms; the performance improves with 
increasing number of sensors. Veeravalli's scheme on the other 
hand requires an exponential growth in SNR (with $L$, as in 
(20)) to maintain the same delay versus $P_{\mathrm{FA}}$ performance. 
Our algorithms need a higher level of time and frequency 
synchronization of the transmitters for beamforming. Section 
IV-D studies the effect of lack of perfect channel knowledge. 
Transmit beamforming can be achieved via uplink-downlink 
reciprocity in a static time-division duplex (TDD) system (see 
[1] for an example mechanism). 
B. Performance Comparisons Under Different Channel and 
Observation SNRs 

We now portray performance under three different settings. 

• Fig. 4 shows performance for various channel SNRs 
$P/\sigma^2_{\mathrm{MAC}}$; the other parameters remain as in Simulation 
Setup 1. 

• Fig. 5 shows performance for various observation SNRs 
$(m_1 - m_0)^2/\sigma^2_{\mathrm{obs}}$ when the channel SNR $P/\sigma^2_{\mathrm{MAC}}$ is fixed 
at 3 dB. 

• Fig. 6 compares the symmetric and asymmetric channel 
gain cases. The symmetric curve is obtained with $h_l = 1$ 
for $l = 1, 2$, and the asymmetric one with $h_1 = 1$ and 
$h_2 = 0.75$. The weaker sensor is 2.5 dB lower than the 
stronger one. 

The plots show graceful degradation with decreasing SNR, with 
results along expected lines. 



Fig. 6. Performance curves when 1) centralized (no channel noise) 2) 
symmetric channel gains 3) asymmetric channel gains with the weaker sensor 
2.5 dB lower. 

C. Comparison of Power- and Energy-Constrained Formulations 

For a fixed upper bound on $P_{\mathrm{FA}}$, we first identify the minimum 
time to detect change as a function of the energy constraint. This yields 
a power constraint for the constant power formulation. We 
then compare the delays incurred by the optimal algorithm 
under the two formulations in Fig. 7. We use the parameters 
in Simulation Setup 1 and $h_l = 1$ for all sensors. For the 
same $P_{\mathrm{FA}}$, the energy-constrained solution declares a change 
with lesser delay than the constant power solution. 

As an illustration, we plot in Fig. 8 the variation of $\alpha^2$, $c$, 
and $\mu$ with time in both the algorithms for a representative 
sample path. The change point is at 21 samples, shown using 
a dotted vertical grid line. The energy-constrained solution is 
more energy efficient because it uses lower energy ($\alpha^2$) before 
and higher energy after the change point. Indeed, based on the 
prior information, the first few samples use negligible energy. 


D. Channel Estimation Errors 

Thus far we assumed a static channel with perfect knowledge 
available at both transmitter and receiver. Wireless channels, 
however, change with time. Only an estimate of the 
channel, based on signal processing on the pilots, beacons, 
or preambles, may be available. In this section, we study the 
effect of imperfect channel knowledge on the physical layer 
fusion algorithm. 

Fig. 7. Comparison of constant power method and energy-constrained 
method. 

Fig. 8. $\alpha^2$, $c$, and $\mu$ of constant power method and energy-constrained 
method for a sample path. 

To arrive at a model for channel errors, we consider complex 
channel gains over the GMAC with noise given by $Z_{MAC,k} \sim \mathcal{CN}(0, \sigma_{MAC}^2)$, 
a circularly symmetric complex Gaussian random 
variable. The observations are real-valued, but the complex 
baseband equivalent signal has two real-valued degrees of 
freedom per sample, leading to a bandwidth expansion factor 
of two. Suppose that the sensors use transmit beamforming$^1$, 
i.e., $\alpha_l = \gamma_l h_l^* / |h_l|$. Then it is sufficient to preserve only the 
real part of the received signal at the fusion center, and the 
problem reduces to that studied in the earlier parts of this paper 
with $\sigma_{MAC}^2$ replaced by $\sigma_{MAC}^2/2$ in Section III. The quantity $\gamma_l$ 
replaces $\alpha_l$ and $|h_l|$ replaces $h_l$ in Algorithm 1. The output of 
the algorithm is $\gamma_l$. 
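
As a rough numerical illustration of this reduction, the sketch below applies phase-compensating weights to complex gains and checks that the effective gains become real while the real part of the channel noise carries half the complex noise variance. The weight form `gamma * conj(h) / abs(h)`, the function name `beamforming_weights`, and all numerical values are assumptions made for illustration only.

```python
import numpy as np

def beamforming_weights(h, gamma):
    """Phase-compensating weights alpha_l = gamma_l * conj(h_l) / |h_l| (assumed form)."""
    return gamma * np.conj(h) / np.abs(h)

rng = np.random.default_rng(0)
L = 4                                    # hypothetical number of sensors
# CN(0, 1) channel gains: independent real and imaginary parts of variance 1/2.
h = (rng.standard_normal(L) + 1j * rng.standard_normal(L)) / np.sqrt(2)
gamma = np.ones(L)                       # hypothetical real amplitudes

alpha = beamforming_weights(h, gamma)
effective_gain = h * alpha               # equals gamma_l * |h_l|, a real number

# Complex GMAC noise with total variance sigma_mac2; keep only the real part.
sigma_mac2 = 2.0
n = 200_000
z = np.sqrt(sigma_mac2 / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
real_noise_var = np.var(z.real)          # close to sigma_mac2 / 2
```

With the imaginary part discarded, the real-valued analysis applies with gains `gamma * abs(h)` and noise variance `sigma_mac2 / 2`.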

Let $\{h_l\}$ be a sequence of $\mathcal{CN}(0, 1)$ random variables that 
obey a block-fading model, i.e., the channel remains constant 
for $T$ uses and then changes to an independent channel gain. 
If $K$ of these $T$ samples are available for channel estimation, 
then the MMSE estimate of the channel is $\hat{h}_l = (h_l + rZ)/(1 + r^2)$, 
where $r = \sigma_{MAC}/\sqrt{K P_l}$, $P_l$ is the power of sensor $l$, and 
$Z \sim \mathcal{CN}(0, 1)$. This is estimated at both ends (using the 
channel reciprocity of a TDD system). 

$^1$The optimality of cooperative transmit beamforming by sensors remains 
an open question. 

Fig. 9. Performance curves comparing the cases when 1) channel is perfectly 
known 2) MMSE estimates are used (Pilot SNR = Channel SNR). 

Fig. 10. Performance curves comparing the cases when 1) channel is perfectly 
known 2) MMSE estimates are used (Pilot SNR is 8.75 dB lower than Channel 
SNR). 
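
The MMSE channel estimate used in this section can be sanity-checked with a short Monte Carlo sketch. It draws independent block-fading gains, forms the estimate $(h + rZ)/(1 + r^2)$, and compares the empirical mean squared error with $r^2/(1 + r^2)$, the value implied by that expression for unit-variance gains. The function name `mmse_channel_estimate` and the parameter values are illustrative assumptions, not part of the original simulation setup.

```python
import numpy as np

def mmse_channel_estimate(h, sigma_mac, K, P, rng):
    """Return (h + r Z) / (1 + r^2) with r = sigma_mac / sqrt(K P), Z ~ CN(0, 1)."""
    r = sigma_mac / np.sqrt(K * P)
    z = (rng.standard_normal(h.shape) + 1j * rng.standard_normal(h.shape)) / np.sqrt(2)
    return (h + r * z) / (1.0 + r ** 2), r

rng = np.random.default_rng(1)
n_blocks = 200_000                       # one independent gain per fading block
h = (rng.standard_normal(n_blocks) + 1j * rng.standard_normal(n_blocks)) / np.sqrt(2)

# Hypothetical values: a single pilot (K = 1), unit power, unit noise std.
h_hat, r = mmse_channel_estimate(h, sigma_mac=1.0, K=1, P=1.0, rng=rng)

empirical_mse = np.mean(np.abs(h_hat - h) ** 2)
theoretical_mse = r ** 2 / (1.0 + r ** 2)  # error variance of the estimator
```

Increasing `K` or `P` shrinks `r`, so the estimate approaches the true gain.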

Figures 9 and 10 show the performance of the policy of Section 
III-C with $\hat{h}$ used in place of the actual $h$, across different channel 
SNRs. Simulation Setup 1 parameters are used. $K = 1$, i.e., 
only one pilot sample is used for channel estimation, so that 
transmit beamforming is only loosely enabled. The pilot SNR 
equals the channel SNR in Fig. 9 and is 8.75 dB lower in 
Fig. 10. The top-left subplot in Fig. 9 shows that the transmit 
beamforming scheme with estimation errors is indeed superior 
to Veeravalli's scheme on a coherent GMAC. Fig. 10 shows no 
benefit because the pilot SNR is not sufficient. Other subplots 
show graceful degradation with decreasing SNR. 
V. Summary 

We considered the use of an analog transmission strategy 
via an affine transformation in order to exploit correlation in 
the sensor observations. The goal was to detect a change with 
minimum expected detection delay given an upper bound on 
the false alarm rate. We modeled the problem as a Markov 
decision problem with partial observations. We characterized 
the optimal control as one that maximizes an Ali-Silvey distance 
between the two hypotheses before and after the change 
(Appendix I). In the GMAC setting, the optimal strategy 
minimizes the error variance of an equivalent observation 
at the fusion center. We then gave an explicit algorithm to 
identify the optimal control parameters. 

We also studied a suboptimal policy that traded performance 
for quantity of information fed back. We then demonstrated 
via simulation the performance gain achieved by our algorithm 
over another scheme that makes only a naive use of the 
GMAC. The latter is a multi-access strategy optimal for 
independent data coupled with an optimal distributed quan- 
tization scheme for change detection; it is suboptimal because 
it does not exploit the correlation in sensor observations. 
Our proposed algorithm exploits this correlation via transmit 
beamforming on the GMAC. Given the control feedback in 
our setting, optimal transmission strategies will change from 
channel to channel. Techniques based on separation principles 
are therefore likely to be suboptimal. 

Distributed transmit beamforming is crucial to realize our 
proposed scheme. The master-slave architecture of Mudumbai, 
Barriac & Madhow [1] and associated channel sensing tech- 
niques can be used for frequency and phase synchronization. 
Simulations with channel estimation errors indicate that the 
degradation due to lack of perfect channel knowledge is 
tolerable, making this analog technique a viable option for 
implementation. 

We then considered a constraint on the average energy 
expended instead of a power constraint. We demonstrated 
via simulation that this made better use of the scarce energy 
resource. Extensions to arbitrary but known distributions, in 
particular to the exponential family, and to $M$-ary hypotheses 
can be found in [21, Ch. 5]. 



Appendix I 
A Characterization of Optimal Control 

The following characterization of the minimization in (12) was used in 
identifying the optimal controls. The characterization refers to a 
quantification of dissimilarity between probability measures 
called Ali-Silvey distances ([2]). Relative entropy (Kullback- 
Leibler divergence) is one example. Such dissimilarity mea- 
sures have a well-known monotonicity property: data pro- 
cessing, whether deterministic or random, cannot increase the 
dissimilarity measure between two distributions ([2], [25]). 
This characterization may be of interest in other sequential 
detection settings. 
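
The monotonicity property can be illustrated with relative entropy on a small discrete example. The sketch below merges two outcomes (a deterministic processing step) and compares the divergence before and after. The distributions `p` and `q` and the merge map are arbitrary illustrative choices.

```python
import numpy as np

def kl_divergence(p, q):
    """Relative entropy D(p || q) for discrete distributions, in nats."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0                          # terms with p = 0 contribute zero
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Two hypothetical distributions on three outcomes.
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.2, 0.3, 0.5])

# Deterministic processing: outcomes 0 and 1 are merged into one symbol.
merge = np.array([[1.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])

kl_before = kl_divergence(p, q)
kl_after = kl_divergence(merge @ p, merge @ q)   # cannot exceed kl_before
```

Replacing `merge` by any column-stochastic matrix gives a randomized processing step, for which the same inequality holds.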

Theorem 5: The minimization in (12) is obtained via a 
maximization of an Ali-Silvey distance between the density 
functions $f_{m_1,a}$ and $f_{m_0,a}$. 



Proof: We first show that the minimization in (12) can 
be expressed as the maximization of an Ali-Silvey distance 
$E_{p_1}[C(\phi(Y))]$ between probability density functions (pdf) 
$p_1$ and $p_2$, where 

$$\phi(y) = \frac{p_2(y)}{p_1(y)} = \frac{f_{m_1,a}(y)}{h(y, a, \mu)}$$

and $C$ is a convex function. To see this, observe that both 
$p_1(\cdot)$ and $p_2(\cdot)$ are densities. The density $p_1$ is a mixture of 
pdfs under the two hypotheses while $p_2$ is the pdf under $H_1$. 
Thus $g(y, a, \mu)/h(y, a, \mu) = \beta\phi(y)$, where $\beta = \mu + (1 - \mu)p$. 
From (12), we have 

$$A_J(\mu) = \min_a E_{p_1}\left[J(\beta\phi(Y))\right] = \min_a E_{p_1}\left[G(\phi(Y))\right] = 1 - \max_a E_{p_1}\left[C(\phi(Y))\right] \tag{21}$$

where $G(x) = J(\beta x)$ and $C(x) = 1 - G(x)$. $J$ is concave; 
so $G$ is concave, $C$ is convex, and (21) is obtained via a 
maximization of an Ali-Silvey distance between $p_1$ and $p_2$. 
Now, 

$$E_{p_1}\left[C(\phi(Y))\right] = E_{p_1}\left[C\left(\frac{p_2(Y)}{p_1(Y)}\right)\right] = E_{p_2}\left[\frac{p_1(Y)}{p_2(Y)}\, C\left(\frac{p_2(Y)}{p_1(Y)}\right)\right] = E_{p_2}\left[C_1(\phi'(Y))\right] \tag{22}$$

where $\phi'(y) = p_1(y)/p_2(y)$ and $C_1(x) = xC(1/x)$. $C_1(x)$ 
is a convex function because $C(x)$ is convex and $x$ is nonnegative. 
Now, let $p_3(y) = f_{m_0,a}(y)$. Since 

$$p_1(y)/p_2(y) = \beta + (1 - \beta)\, p_3(y)/p_2(y),$$

it is clear that $C_2(x) = C_1(\beta + (1 - \beta)x)$ is a convex 
function. Setting $\phi''(y) = p_3(y)/p_2(y)$, the likelihood ratio 
between the original two hypotheses, (22) can be written as 
$E_{p_2}\left[C_2(\phi''(Y))\right]$, an Ali-Silvey distance between $f_{m_1,a}$ and 
$f_{m_0,a}$, and the theorem follows. ■ 



Appendix II 
Proof of Lemma 4 

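Here we solve Problem 3. Order the indices so that 
$\sigma_{obs,1}^2 h_1 \alpha_{\max,1} \leq \cdots \leq \sigma_{obs,L}^2 h_L \alpha_{\max,L}$. Let us first add a 
constraint $\sum_{l=1}^{L} h_l \alpha_l = a$, where without loss of generality 
$a \in [0, a_{\max}]$, with $a_{\max} = \sum_{l=1}^{L} h_l \alpha_{\max,l}$, and solve the 
convex optimization problem: 

Problem 5: Minimize $\sum_{l=1}^{L} \sigma_{obs,l}^2 h_l^2 \alpha_l^2$ subject to $\alpha_l \in 
[0, \alpha_{\max,l}]$, $\sum_{l=1}^{L} h_l \alpha_l = a \in [0, a_{\max}]$. 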

This problem is a special case of a separable convex optimization 
problem studied in Padakandla and Sundaresan [26]. 
Execution of [26, Algorithm 1] yields the following solution. 
Break $[0, a_{\max}]$ into $L$ intervals $[a_k, a_{k+1}]$, $k = 0, 1, \ldots, L - 1$, 
where $a_0 = 0$ and 

$$a_k = \sum_{l=1}^{k} h_l \alpha_{\max,l} + \sigma_{obs,k}^2 h_k \alpha_{\max,k} \left( \sum_{l=k+1}^{L} \sigma_{obs,l}^{-2} \right).$$

The ordering of $\sigma_{obs,l}^2 h_l \alpha_{\max,l}$ implies $a_k \leq a_{k+1}$, so that each 
interval is nonempty. With $k$ such that $a \in [a_k, a_{k+1}]$, the 
optimal solution is: 



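$$\alpha_l = \alpha_{\max,l}, \quad l = 1, \ldots, k, \tag{23}$$

$$\alpha_l = \frac{1}{h_l \sigma_{obs,l}^2} \cdot \frac{a - \sum_{m=1}^{k} h_m \alpha_{\max,m}}{\sum_{m=k+1}^{L} \sigma_{obs,m}^{-2}}, \quad l > k. \tag{24}$$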

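The corresponding minimum value of Problem 5 for a given 
$a$, denoted by $V(a)$, is given by 

$$V(a) = \sum_{l=1}^{k} \sigma_{obs,l}^2 h_l^2 \alpha_{\max,l}^2 + \frac{\left( a - \sum_{l=1}^{k} h_l \alpha_{\max,l} \right)^2}{\sum_{l=k+1}^{L} \sigma_{obs,l}^{-2}}.$$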
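
The closed-form solution of Problem 5 can be sketched in a few lines. The function below assumes the sensors are already ordered by increasing $\sigma_{obs,l}^2 h_l \alpha_{\max,l}$, locates the interval $[a_k, a_{k+1}]$ containing $a$, saturates the first $k$ sensors, and spreads the remaining amplitude over the others. The name `solve_problem5` and the numerical values are hypothetical. This is an illustration of the formulas in this appendix, not the implementation of [26, Algorithm 1].

```python
import numpy as np

def solve_problem5(a, h, sigma2_obs, alpha_max):
    """Return (alpha, V(a)); indices must be sorted by sigma2_obs * h * alpha_max."""
    h, s2, amax = (np.asarray(x, dtype=float) for x in (h, sigma2_obs, alpha_max))
    L = len(h)
    # tail[k] = sum_{l > k} 1 / sigma2_obs_l (1-based l); tail[L] = 0.
    tail = np.concatenate((np.cumsum((1.0 / s2)[::-1])[::-1], [0.0]))
    # head[k] = sum_{l <= k} h_l * alpha_max_l; head[0] = 0.
    head = np.concatenate(([0.0], np.cumsum(h * amax)))
    # Breakpoints a_0 = 0, a_1, ..., a_L = a_max.
    a_pts = head + np.concatenate(([0.0], s2 * h * amax)) * tail
    # Index k of the interval [a_k, a_{k+1}] containing a.
    k = int(np.searchsorted(a_pts, a, side="right")) - 1
    k = min(max(k, 0), L - 1)
    rem = a - head[k]                       # amplitude left for sensors l > k
    alpha = amax.copy()                     # sensors 1..k are saturated
    alpha[k:] = rem / (h[k:] * s2[k:] * tail[k])
    value = np.sum(s2[:k] * h[:k] ** 2 * amax[:k] ** 2) + rem ** 2 / tail[k]
    return alpha, value

# Hypothetical three-sensor example, already in the required order.
h = [1.0, 0.8, 1.2]
sigma2_obs = [0.5, 1.0, 2.0]
alpha_max = [1.0, 1.0, 1.0]
alpha, value = solve_problem5(2.0, h, sigma2_obs, alpha_max)
```

In this example only the first sensor saturates, and the returned `value` agrees with the objective evaluated directly at `alpha`.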



We next look for an optimal $a$ by solving 

Problem 6: Minimize $f(a) = \left( V(a) + \sigma_{MAC}^2 \right)/a^2$ subject to $a \in 
[0, a_{\max}]$. 

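While this is not yet a convex optimization, the transformation 
$b = 1/a$ casts it into one. Define 

$$g(b) = f\left(\frac{1}{b}\right) = b^2 \left[ \sigma_{MAC}^2 + \sum_{l=1}^{k} \sigma_{obs,l}^2 h_l^2 \alpha_{\max,l}^2 + \frac{\left( \sum_{l=1}^{k} h_l \alpha_{\max,l} \right)^2}{\sum_{l=k+1}^{L} \sigma_{obs,l}^{-2}} \right] - 2b\, \frac{\sum_{l=1}^{k} h_l \alpha_{\max,l}}{\sum_{l=k+1}^{L} \sigma_{obs,l}^{-2}} + \frac{1}{\sum_{l=k+1}^{L} \sigma_{obs,l}^{-2}}$$

for $b \in [1/a_{\max}, \infty)$, where $k$ depends on $b$ through the 
index of the interval in which $a = 1/b$ lies. The following 
observations on $g$ are easy to verify: 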

• $g(b)$ is a convex parabola on each $[1/a_{k+1}, 1/a_k]$, $k = 
L - 1, \ldots, 1$, and on $[1/a_1, \infty)$; 

• $g(b)$ is continuous in $[1/a_{\max}, \infty)$. This needs checking 
only at interval boundaries $1/a_k$; 

• $g(b)$ is continuously differentiable in $(1/a_{\max}, \infty)$ with 
left continuous derivative at $1/a_{\max}$; 

• $\lim_{b \to \infty} g(b) = +\infty$, so that the derivative eventually 
becomes positive for large $b$. 

Since $g$ is convex and continuously differentiable, if we can 
find a $b^*$ such that $g'(b^*) = 0$ and $b^* \in [1/a_{k+1}, 1/a_k]$ (or 
$[1/a_1, \infty)$), where $k$ corresponds to $a^* = 1/b^*$, then $b^*$ is a 
point of global minimum. This holds if the minimum point for 
a parabola defined in $[1/a_{k+1}, 1/a_k]$ (or $[1/a_1, \infty)$), which is 
easily verified to be 

$$a^* = 1/b^* = \sum_{l=1}^{k} h_l \alpha_{\max,l} + \frac{\left( \sum_{l=k+1}^{L} \sigma_{obs,l}^{-2} \right) \left( \sigma_{MAC}^2 + \sum_{l=1}^{k} (\sigma_{obs,l} h_l \alpha_{\max,l})^2 \right)}{\sum_{l=1}^{k} h_l \alpha_{\max,l}},$$

also belongs to that interval. This leads to the condition (15). If 
no such point occurs, $g'(b) \neq 0$ in $[1/a_{\max}, \infty)$, and since $g'$ 
is eventually positive, it must be positive in the entire interval. 
In this latter case $g$ is an increasing function on $[1/a_{\max}, \infty)$ 
and the minimum is attained at $b^* = 1/a_{\max}$ or $a^* = a_{\max}$ 



or fc = 
proof. 
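
The search for the optimal $a$ in Problem 6 can be sketched as follows, again assuming the sensors are sorted by increasing $\sigma_{obs,l}^2 h_l \alpha_{\max,l}$. For each $k$ the stationary point of the corresponding parabola is computed and accepted if it falls in $[a_k, a_{k+1}]$. Otherwise $a_{\max}$ is returned. The helper names `_tables`, `objective`, and `optimal_total_gain` and the numbers are hypothetical, and the sketch is meant only as a check of the formulas above.

```python
import numpy as np

def _tables(h, sigma2_obs, alpha_max):
    """Cumulative sums and breakpoints a_0, ..., a_L used by both functions."""
    h, s2, amax = (np.asarray(x, dtype=float) for x in (h, sigma2_obs, alpha_max))
    tail = np.concatenate((np.cumsum((1.0 / s2)[::-1])[::-1], [0.0]))
    head = np.concatenate(([0.0], np.cumsum(h * amax)))
    energy = np.concatenate(([0.0], np.cumsum(s2 * (h * amax) ** 2)))
    a_pts = head + np.concatenate(([0.0], s2 * h * amax)) * tail
    return head, tail, energy, a_pts

def objective(a, h, sigma2_obs, alpha_max, sigma2_mac):
    """f(a) = (V(a) + sigma2_mac) / a^2 with V(a) from the closed form."""
    head, tail, energy, a_pts = _tables(h, sigma2_obs, alpha_max)
    L = len(a_pts) - 1
    k = min(max(int(np.searchsorted(a_pts, a, side="right")) - 1, 0), L - 1)
    v = energy[k] + (a - head[k]) ** 2 / tail[k]
    return (v + sigma2_mac) / a ** 2

def optimal_total_gain(h, sigma2_obs, alpha_max, sigma2_mac):
    """Return a* minimizing f(a) over (0, a_max]."""
    head, tail, energy, a_pts = _tables(h, sigma2_obs, alpha_max)
    L = len(a_pts) - 1
    for k in range(1, L):
        # Stationary point of the parabola associated with index k.
        a_star = head[k] + tail[k] * (sigma2_mac + energy[k]) / head[k]
        if a_pts[k] <= a_star <= a_pts[k + 1]:
            return a_star
    return a_pts[L]                          # no interior stationary point

# Hypothetical three-sensor example, already in the required order.
h = [1.0, 0.8, 1.2]
sigma2_obs = [0.5, 1.0, 2.0]
alpha_max = [1.0, 1.0, 1.0]
a_opt = optimal_total_gain(h, sigma2_obs, alpha_max, sigma2_mac=1.0)
```

A brute-force grid evaluation of `objective` over $(0, a_{\max}]$ is a simple way to cross-check the returned value for other parameter choices.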